A Corpus-Based Concatenative Speech Synthesis System for Turkish

نویسندگان

  • Haşim SAK
  • Tunga GÜNGÖR
  • Yaşar SAFKAN
چکیده

Speech synthesis is the process of converting written text into machine-generated synthetic speech. Concatenative speech synthesis systems form utterances by concatenating pre-recorded speech units. Corpus-based methods use a large inventory to select the units to be concatenated. In this paper, we design and develop an intelligible and natural sounding corpus-based concatenative speech synthesis system for the Turkish language. The implemented system contains a front-end comprised of text analysis, phonetic analysis, and optional use of transplanted prosody. The unit selection algorithm is based on commonly used Viterbi decoding algorithm of the best-path in the network of the speech units using spectral discontinuity and prosodic mismatch objective cost measures. The back-end is the speech waveform generation based on the harmonic coding of speech and overlap-and-add mechanism. Harmonic coding enabled us to compress the unit inventory size by a factor of three. In this study, a Turkish phoneme set has been designed and a pronunciation lexicon for root words has been constructed. The importance of prosody in unit selection has been investigated by using transplanted prosody. A Turkish Diagnostic Rhyme Test (DRT) word list that can be used to evaluate the intelligibility of Turkish Text-to-Speech (TTS) systems has been compiled. Several experiments have been performed to evaluate the quality of the synthesized speech and we obtained 4.2 Mean Opinion Score (MOS) in the listening tests for our system, which is the first unit selection based system published for Turkish.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Forward Masking Phenomenon in Concatenative Speech Synthesis

The approach described in the paper tries to get more knowledge to the concatenative text-to-speech system design. The knowledge is based on masking phenomenon of the inner ear, particularly of its temporal (forward) masking properties. Designing such knowledge-based system is suggested to use in the unit selection-based speech synthesis, as contemporary a prominent technique in concatenative s...

متن کامل

Development of Concatenative Syllable based Text to Speech Synthesis System for Tamil

This paper addresses the problem of improving the intelligibility of the synthesized speech in Tamil TTS synthesis system. The human speech is artificially generated by Speech synthesis. The normal language text will be automatically converted into speech using Text-to-speech (TTS) system. This paper deals with a corpus-driven Tamil TTS system based on the concatenative synthesis approach. Conc...

متن کامل

Concatenative Speech Synthesis: A Review

The primary objective of this paper is to provide an overview of existing Concatenative Text-To-Speech synthesis techniques. Concatenative speech synthesis can be broadly categorized into three categories, Diphone Based, Corpus based and Hybrid. Diphone based speech synthesis relies on different signal processing techniques such as PSOLA, FD-PSOLA etc. These signal processing techniques introdu...

متن کامل

Introduction to multilingual corpus-based concatenative speech synthesis

This tutorial paper addresses foreign-language support in corpus-based concatenative text-to-speech systems. We give an overview of application domains where strictly monolingual speech synthesis is not sufficient and where multilingual text-to-speech is required or highly desirable. We describe two approaches to multilingual corpus-based speech synthesis: phoneme mapping on the one hand, and t...

متن کامل

Design of English to Hindi Corpus Based Text Conversion and Hindi Text to Speech Synthesis

English is a global language but is understood by few percentage of population in India. It continues to remain a barrier for rural population to learn and compete at a global level. Machine translation helps people from different places to understand an unknown language without the aid of human translator. A Text to Speech system generatesspeech from text given as input. The proposed system wi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006